Getting the product information
Get the title from the page's <head>
<title>【图】(上门回收)苹果平板-笔记本-微软平板全系列IPad-Pro-mini4-Air2 - 平板电脑 - 北京58同城</title>
title = soup.title.text
print(title)
【图】(上门回收)苹果平板-笔记本-微软平板全系列IPad-Pro-mini4-Air2 - 平板电脑 - 北京58同城
Get the price
<span class="price c_f50">5000</span>
select() always returns a list. When the element we want appears only once on the page, that list has length 1, so we can read it directly at index 0.
price = soup.select('span.price.c_f50')[0].get_text()
print(price)
5000
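The index-0 access above assumes the selector matches at least one element. As a minimal sketch of what happens otherwise, the snippet below parses the price markup shown earlier from an inline string (an assumption made so that it runs without a network request) and uses the standard-library `html.parser` backend instead of `lxml`. The selector `span.no-such-class` is hypothetical and exists only to show the no-match case.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the price element shown above.
html = '<span class="price c_f50">5000</span>'
soup = BeautifulSoup(html, 'html.parser')

# select() returns a list of every match, even when there is only one.
matches = soup.select('span.price.c_f50')

# select_one() returns the first match, or None when nothing matches.
first = soup.select_one('span.price.c_f50')
missing = soup.select_one('span.no-such-class')  # hypothetical selector

# Guard against None instead of indexing into a possibly empty list.
price = first.get_text() if first is not None else None
print(len(matches), price, missing)
```

With a selector that matches nothing, `select()` gives back an empty list, so `[0]` would raise an `IndexError`. Checking the result of `select_one()` for `None` is one way to avoid that if the page layout changes.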
Get the posting date
<li class="time" title="发布日期">2016-05-19</li>
time = soup.select('li.time')[0].get_text()
print(time)
2016-05-19
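The three lookups above can be gathered into a single function that returns one record per page. The following is only a sketch under stated assumptions: the names `text_or_none` and `get_item_info` are hypothetical, the function takes an HTML string rather than fetching a URL, the sample page is a shortened stand-in built from the fragments shown above, and `html.parser` is used in place of `lxml` so that the example has no extra dependency.

```python
from bs4 import BeautifulSoup


def text_or_none(soup, selector):
    """Return the stripped text of the first match, or None if absent."""
    node = soup.select_one(selector)
    return node.get_text(strip=True) if node is not None else None


def get_item_info(html):
    """Collect title, price and posting date from one detail page (hypothetical helper)."""
    soup = BeautifulSoup(html, 'html.parser')
    return {
        'title': soup.title.get_text(strip=True) if soup.title else None,
        'price': text_or_none(soup, 'span.price.c_f50'),
        'time': text_or_none(soup, 'li.time'),
    }


# Shortened stand-in for a detail page, assembled from the fragments above.
sample = """
<html><head><title>sample item - 58</title></head>
<body>
<span class="price c_f50">5000</span>
<ul><li class="time" title="发布日期">2016-05-19</li></ul>
</body></html>
"""

info = get_item_info(sample)
print(info)
```

Returning a dict keeps the fields together, and a field that is missing on a given page comes back as `None` rather than stopping the crawl with an exception.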
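Getting the URLs from the list page
Write a function that collects the URL of every product on the list page. It works the same way as the lookups above, so the details are not repeated here.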
<a class="title t" href="http://jump.zhineng.58.com/jump?target=pZwY0jCfsvFJsWN3shPfUiqkpyOMmh78uA-6UhO6UztzPWTvnWm3nHEOnW03ng980v6YUykKnH9dPHTLnjNOnHb3P1bLnjDvnjTkP1b3rj9zTHczrHbOn1c3njT3PjNdTEDzPWTvnWm3nHEOnW03nEDzPWmKnH0kTHc1njbYTHDKnHNQPW0knj9vPW01rTDQTyQG0Lw_uyuYTHDKnE7-0MPCULRYudtznjDLnjbkP7q-XZKtnEDVnEDKnTDkTEDQTHP6mHK-mHbLsHmkPhmVPADzuBYOPj91symkPh7hrjE3uHEvnTD1PjEYPjEQPEDznWcvPjnYPjc1rHmYPj9KTHc1njbYTHDKna3znWnLnHNQPHTOrjm3nWN1n9DKsEDKTy6YIZTlszqBpB3draOWUvYf0AFbUBtQsk7WPiq8p-ukUH-vNLKwPMwZEg6HnD7M5HYKnHDYsWcYPz3dPB3QP1EKnTDkTED1rjE3PT7exE7WuWm3n1EknWnQPHDzmHEQ&psid=185507059198797016000798882&entinfo=26062681492781_0" target="_blank" rel="nofollow">(上门回收)苹果平板-笔记本-微软平板全系列I</a>
def get_links_from():
    urls = []
    list_view = 'http://bj.58.com/pbdn/1/'
    wb_data = requests.get(list_view)
    soup = BeautifulSoup(wb_data.text, 'lxml')
    for link in soup.select('a.title.t'):
        urls.append(link.get('href'))
    return urls
print(get_links_from())
['http://jump.zhineng.58.com/jump?target=pZwY0jCfsvFJsWN3shPfUiqkpyOMmh78uA-6UhO6UztzPWTvnWm3nHEOnW03ng980v6YUykKnHnYnj0kP1mLnHb3P1bLnjcLnHEznWNvPWN3THczrHbOn1c3njT3PjNdTEDzPWTvnWm3nHEOnW03nEDzPWmKnWD3THc1njbYTHDKnHNQPW0knjbdnH01P9DQTyQG0Lw_uyuYTHDKnE7-0MPCULRYudtznjDLnjbkP7q-XZKtnEDVnEDKnTDkTEDQTHIbPAcLuAPbsyNvrHDVPj0LmzdBnyDOsyubuHT1PhF-uhcOuTD1PjEYPjEQPEDznWcvPjnYPjc1rHmYPj9KTHc1njbYTHDKna3QrjnzP1bknW9LPHDvP